# Efficient deployment

Gemma 3 4b It Quantized.w4a16

A quantized version of google/gemma-3-4b-it that stores weights in INT4 and keeps activations in 16-bit floating point (w4a16) to improve inference efficiency.
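
The w4a16 label is commonly read as 4-bit weights with 16-bit activations. The following is a minimal sketch of what symmetric, per-group INT4 weight quantization can look like. The listing does not state the exact scheme used for this model, so the symmetric rounding, the group size of 4, and the function names `quantize_int4` and `dequantize_int4` are all illustrative assumptions.

```python
def quantize_int4(weights, group_size=4):
    """Symmetric per-group INT4 quantization: returns (codes, scales).

    Illustrative only; the group size and rounding scheme are assumptions.
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to 7, the top of the
        # signed 4-bit range [-8, 7]. An all-zero group gets scale 1.0.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Recover approximate floating-point weights from codes and scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


weights = [0.12, -0.70, 0.33, 0.05, 1.40, -0.20, 0.00, 0.90]
codes, scales = quantize_int4(weights)
restored = dequantize_int4(codes, scales)

# Rounding error per weight is bounded by half of that group's scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(f"max reconstruction error: {max_error:.4f}")
```

Each weight is stored as a 4-bit code plus one shared scale per group. At inference time the codes are expanded back to floating point, and activations stay in 16-bit precision throughout.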

GLM 4 32B 0414 4bit DWQ

This is the MLX format version of the THUDM/GLM-4-32B-0414 model, processed with 4-bit DWQ quantization, suitable for efficient inference on Apple silicon devices.

Large Language Model Supports Multiple Languages

Spec-T1-RL-7B is a high-precision large language model focused on mathematical reasoning, algorithmic problem-solving, and code generation, with strong reported results on technical benchmarks.

Large Language Model

Safetensors English

SVECTOR-CORPORATION

Qwen3 30B A3B Gptq 8bit

This is an 8-bit GPTQ-quantized version of the Qwen3 30B A3B large language model, suitable for efficient inference.

Large Language Model

Whisper Large V3 Turbo Quantized.w4a16

An INT4 weight-quantized version of openai/whisper-large-v3-turbo for efficient speech-to-text transcription.

Speech Recognition

Transformers English

Llama 2 7b Chat Hf GGUF

Llama 2 7B Chat is a 7B-parameter large language model developed by Meta, offered here in GGUF format at multiple quantization levels to suit different hardware constraints.

Large Language Model English
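
As a rough guide to why multiple quantization levels matter for hardware fit, the sketch below estimates weight storage for a 7B-parameter model at a few nominal bit widths. It assumes an exact count of 7,000,000,000 parameters and ignores file metadata, per-block scales, and runtime memory such as the KV cache, so real file sizes will differ. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate weight storage in GiB, ignoring metadata and runtime overhead."""
    return num_params * bits_per_weight / 8 / 2**30


NUM_PARAMS = 7_000_000_000  # nominal 7B parameter count (rounded assumption)

# Nominal bit widths only; actual quantization formats add per-block overhead.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Halving the bits per weight roughly halves the storage needed for the weights, which is the main reason lower-bit variants can fit on smaller devices.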

Qwq 32B Bnb 4bit

A 4-bit quantized version of QwQ-32B, produced with bitsandbytes, suitable for efficient inference in resource-constrained environments.

Large Language Model

Llama 3 8B Instruct GPTQ 4 Bit

This is a 4-bit GPTQ quantization of Meta Llama 3 8B Instruct, produced by Astronomer, that can run efficiently on low-VRAM devices.

Large Language Model

Moritzlaurer Roberta Base Zeroshot V2.0 C Onnx

This is the ONNX format conversion of the MoritzLaurer/roberta-base-zeroshot-v2.0-c model, suitable for zero-shot classification tasks.

Text Classification

© 2025 AIbase